Extracting Paraphrases from Aligned Corpora

نویسنده

  • Ali Ibrahim
چکیده

The Problem: The expressiveness of human language allows people to express the same idea in many different ways; they may use different words to refer to the same entity or employ different phrases to describe the same concept. Thus, an effective information retrieval (IR) and question answering (QA) system must be equipped to handle these variations, both when processing documents and when fielding queries. While there are many resources that help systems deal with single word synonyms, there are few resources for multiple word paraphrases.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Extracting Structural Paraphrases from Aligned Monolingual Corpora

We present an approach for automatically learning paraphrases from aligned monolingual corpora. Our algorithm works by generalizing the syntactic paths between corresponding anchors in aligned sentence pairs. Compared to previous work, structural paraphrases generated by our algorithm tend to be much longer on average, and are capable of capturing long-distance dependencies. In addition to a st...

متن کامل

Extracting Lay Paraphrases of Specialized Expressions from Monolingual Comparable Medical Corpora

Whereas multilingual comparable corpora have been used to identify translations of words or terms, monolingual corpora can help identify paraphrases. The present work addresses paraphrases found between two different discourse types: specialized and lay texts. We therefore built comparable corpora of specialized and lay texts in order to detect equivalent lay and specialized expressions. We ide...

متن کامل

Applicability Analysis of Corpus-derived Paraphrases toward Example-based Paraphrasing

Two kinds of paraphrases extracted from a bilingual parallel corpus were analyzed. One is from an adjectival predicate sentence to a non-adjectival one. The other is from a passive form to a non-passive form. The ability to extract paraphrases is strongly desired for paraphrasing studies. Although extracting paraphrases from multi-lingual parallel corpora is possible, the type of paraphrases ex...

متن کامل

Acquiring lexical paraphrases from a single corpus

This paper studies the potential of extracting lexical paraphrases from a single corpus, focusing on the extraction of verb paraphrases. Most previous approaches detect individual paraphrase instances within a pair (or set) of comparable corpora, each of them containing roughly the same information, and rely on the substantial level of correspondence of such corpora. We present a novel method t...

متن کامل

Learning Sentential Paraphrases from Bilingual Parallel Corpora for Text-to-Text Generation

Previous work has shown that high quality phrasal paraphrases can be extracted from bilingual parallel corpora. However, it is not clear whether bitexts are an appropriate resource for extracting more sophisticated sentential paraphrases, which are more obviously learnable from monolingual parallel corpora. We extend bilingual paraphrase extraction to syntactic paraphrases and demonstrate its a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002